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Viral  infections  involve  a  complex  interplay  of  the  immune 
response  and  escape  mutation  of  the  virus  quasispecies  inside  a 
single  host.  Although  fundamental  aspects  of  such  a  balance  of 
mutation  and  selection  pressure  have  been  established  by  the  qua¬ 
sispecies  theory  decades  ago,  its  implications  have  largely  re¬ 
mained  qualitative.  Here,  we  present  a  quantitative  approach  to 
model  the  virus  evolution  under  cytotoxic  T-lymphocyte  immune 
response.  The  virus  quasispecies  dynamics  are  explicitly  repre¬ 
sented  by  mutations  in  the  combined  sequence  space  of  a  set  of 
epitopes  within  the  viral  genome.  We  stochastically  simulated  the 
growth  of  a  viral  population  originating  from  a  single  wild-type 
founder  virus  and  its  recognition  and  clearance  by  the  immune  re¬ 
sponse,  as  well  as  the  expansion  of  its  genetic  diversity.  Applied  to 
the  immune  escape  of  a  simian  immunodeficiency  virus  epitope, 
model  predictions  were  quantitatively  comparable  to  the  experi¬ 
mental  data.  Within  the  model  parameter  space,  we  found  two 
qualitatively  different  regimes  of  infectious  disease  pathogenesis, 
each  representing  alternative  fates  of  the  immune  response:  It  can 
clear  the  infection  in  finite  time  or  eventually  be  overwhelmed  by 
viral  growth  and  escape  mutation.  The  latter  regime  exhibits  the 
characteristic  disease  progression  pattern  of  human  immunodefi¬ 
ciency  virus,  while  the  former  is  bounded  by  maximum  mutation 
rates  that  can  be  suppressed  by  the  immune  response.  Our  results 
demonstrate  that,  by  explicitly  representing  epitope  mutations 
and  thus  providing  a  genotype-phenotype  map,  the  quasispecies 
theory  can  form  the  basis  of  a  detailed  sequence-specific  model  of 
real-world  viral  pathogens  evolving  under  immune  selection. 
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Viruses  with  RNA  genomes,  such  as  human  immunodeficiency 
virus  type  1  (HIV-1),  have  high  rates  of  mutation  and  evolve 
rapidly  in  response  to  host  immune  selection  pressure.  One  of  the 
consequences  of  such  rapid  mutations  is  the  error  catastrophe 
(1),  where  a  virus  population  is  driven  to  extinction  when  its  mu¬ 
tation  rate  exceeds  a  threshold.  The  existence  of  such  a  threshold 
is  a  central  prediction  of  the  quasispecies  theory  pioneered  by 
Eigen  (1)  and  Swetina  and  Schuster  (2).  The  recent  experimental 
demonstration  of  lethal  mutagenesis  (3-5),  in  which  an  error  cat¬ 
astrophe  transition  is  caused  by  mutagens,  demonstrates  that  key 
features  of  the  balance  of  mutation  and  selection  in  viruses  are 
elegantly  captured  by  the  quasispecies  theory. 

Although  insights  provided  by  the  quasispecies  theory  have 
greatly  expanded  our  understanding  of  virus  behavior,  applica¬ 
tions  so  far  have  been  limited  to  a  conceptual  level,  partially  due 
to  the  lack  of  experimental  information  on  the  evolutionary  dy¬ 
namics  in  sequence  space.  Next-generation  sequencing  techni¬ 
ques  (6)  have  the  potential  to  change  this  situation.  To  build 
quantitative  models  implementing  the  quasispecies  dynamics 
and  describe  such  experimental  data,  it  is  important  to  realisti¬ 
cally  specify  the  nature  of  selection  pressure.  Viruses  in  animal 
hosts  evolve  under  immune  pressure,  and  their  capacity  for  rapid 
escape  mutation  underlies  many  of  the  difficulties  in  combating 
pathogens,  including  HIV-1.  In  a  typical  disease  pathogenesis  of 
HIV-1,  the  acute  viremia  after  an  initial  infection  is  curbed  by 


CD8  +  cytotoxic  T-lymphocyte  (CTL)  responses  as  well  as  subse¬ 
quent  antibody  actions,  leading  to  an  asymptomatic  chronic  infec¬ 
tion  stage  that  can  last  up  to  10  y  (7).  However,  this  apparent 
control  of  viremia  is  never  complete,  and  without  antiviral  ther¬ 
apy,  the  chronic  infection  eventually  leads  to  the  onset  of  disease. 
This  chronic  infection  stage  involves  continuous  escape  mutation- 
CTL  response  cycles,  whose  detailed  characteristics  are  being  un¬ 
covered  by  ultradeep  sequencing  (8-12).  CTLs  recognize  specific 
viral  epitopes  (approximately  10  amino  acids  long)  presented  on 
the  surface  of  infected  cells  by  class  I  human  leukocyte  antigen 
(HLA).  The  epitope  recognition  depends  sensitively  on  HLA  al¬ 
leles,  leading  to  differential  patterns  of  immune  response  among 
patients  (13-15),  while  characteristics  of  immune  response  during 
early  infection  often  shape  and  influence  the  overall  disease  pro¬ 
gression  (16,  17).  Quantitative  sequence -based  models  of  virus- 
CTL  dynamics  will  greatly  facilitate  the  interpretation  of  experi¬ 
mental  data. 

In  a  series  of  pioneering  works  (18-20),  Nowak  and  coworkers 
introduced  population  dynamics  concepts  that  capture  a  diverse 
range  of  immune  response  and  escape  mutation.  Similar  ap¬ 
proaches  focused  more  on  escape  dynamics  have  also  been  pro¬ 
posed  (21, 22).  These  models,  however,  do  not  describe  mutations 
in  sequence  space  explicitly.  Other  mathematical  models  of 
sequence  evolution  focus  on  sequence  divergence  during  the  very 
early  stages  of  infection,  while  ignoring  or  only  implicitly  including 
the  effect  of  selection  pressure  (23, 24).  In  this  paper,  we  describe 
a  quantitative  quasispecies-based  model  of  virus  dynamics  under 
T  cell-based  immune  pressure,  explicitly  mapping  genotype-phe¬ 
notype  relationships.  Our  approach  combines  the  description  of 
virus  evolutionary  dynamics  provided  by  the  quasispecies  theory 
and  the  population  dynamics  models  of  Nowak  and  coworkers. 
We  show  that  the  model  not  only  captures  the  salient  features 
observed  in  sequencing  data  of  simian  immunodeficiency  virus 
(SIV)  (9)  but  also  reveals  general  qualitative  features  of  viral 
infection  disease  pathogenesis:  The  immune  system  can  clear 
the  infection  within  time  scales  ranging  from  days  to  those  much 
longer  than  patient  lifetimes,  or  be  overwhelmed  by  immune  es¬ 
cape.  The  viral  load  progression  in  the  latter  regime  closely 
matches  the  observed  HIV-1  disease  pathogenesis  pattern. 

Results  and  Discussion 

Stochastic  Quasispecies  Dynamics.  During  acute  infection,  the  foun¬ 
der  virus  (often  a  single  virion)  (7,  11,  17,  25)  undergoes  replica¬ 
tions  to  produce  an  exponentially  growing  population,  which  may 
be  described  by  the  quasispecies  dynamics  without  degradation, 
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fij  =  2 Qnrini >  [i] 

i 

where  w*  is  the  number  of  virions  with  genotype  i,  rt  is  the  repli¬ 
cation  rate  of  i,  and  Qjt  =  (1  -  q)L-d(/,9(q/3)^04)  is  the  mutation 
probability  from  i  to  genotype  j,  with  mean  mutation  rate  p, 
genome  length  L  (in  bp),  and  Hamming  distance  (the  number 
of  nucleotides  that  are  different)  d(j,  i)  between  j  and  i.  A  more 
complete  description  for  early  stages  of  infection,  where  nt  is  of 
the  order  of  1  and  restricted  to  integer  values,  is  given  by  the  che¬ 
mical  reaction  representation 

R^R.  +  R  [2] 

Qjlrl 

where  R,  is  a  virion  with  genotype  i.  Eq.  2  can  be  simulated 
stochastically  with  the  Gillespie  algorithm  (26).  A  major  advan¬ 
tage  of  such  a  stochastic  formulation  is  that  it  allows  us  to  avoid 
exhaustive  enumerations  of  all  possible  genotypes.  This  is  signif¬ 
icant  even  for  a  single  epitope  of  viral  proteins:  An  amino  acid 
sequence  with  La  =  10,  whose  genome  length  is  L  =  3La  = 
30  bp,  still  has  4L  ~  10 18  genotypes.  Nowak  and  Schuster  (27) 
performed  Gillespie  algorithm  simulations  of  the  standard  qua¬ 
sispecies  model  under  the  single-peak  fitness  landscape  ( rt  =  1  /x 
for  the  wild-type  (WT)  sequence  and  rt  =  l /At  for  all  other  mu¬ 
tants,  where  A  >  1  is  the  relative  fitness  of  the  WT  and  x  is  a  char¬ 
acteristic  time  scale).  In  this  case,  distributions  of  individuals 
among  only  two  groups  (WT  and  mutants)  need  to  be  tracked. 

In  our  numerical  simulations,  the  initial  genotype  (referred  to 
here  as  the  WT)  replicates  with  mutation  rate  p,  and  the  gener¬ 
ated  mutants  are  compared  with  the  existing  list  of  sequences, 
which  is  updated  dynamically  when  new  sequences  arise.  We  ver¬ 
ified  that  this  numerical  scheme  sampled  the  relevant  sequence 
space  sufficiently  (SI  Text,  Figs.  S1-S3).  The  initial  condition  we 
used  (single  WT)  and  the  discreteness  of  nt  imply  that  nt  should 
be  interpreted  as  the  total  number  of  virions  within  a  finite  sys¬ 
tem.  The  HIV-1  viral  load  during  the  acute  infection  phase  can 
reach  uptol04~106  RNA  copies/mL  (7).  Accounting  for  the 
volume  of  blood  in  the  body  of  an  average  adult  (approximately 
5  L),  the  total  population  size  Nv  =  E,  nt  would  be  up  to  about 
109.  We  found  simulations  to  slow  down  significantly  as  Nv  be¬ 
came  larger  than  about  106.  For  computational  efficiency,  we 
therefore  regarded  the  system  as  a  small  representative  volume 
(1  mL)  of  blood,  and  the  viral  loads  and  CTL  levels  reported  refer 
to  RNA  copy  numbers  and  cell  counts  within  this  volume. 

CTL  Response.  Within  an  infected  host,  the  action  of  the  cellular 
immune  response  provides  the  major  force  countering  the  acute 
viremia.  In  this  work,  we  combine  the  population  dynamics  ap¬ 
proach  of  virus-immune  system  dynamics  by  Nowak  and  cowor¬ 
kers  (18,  19)  with  the  classic  quasispecies  theory.  Accordingly, 
Eq.  2  is  modified  as  follows: 

R i  R i  +  R,,  (replication  with  mutation)  [3 A] 

Qjiri 

l 

R/^R,  +  Y  C(aik),  (stimulation)  [3B] 

b  ^ 

k=  1 

R,  +  C(aik)^C(aik),  (clearance)  [3C] 

p 

C(a)-»0,  (decay)  [3D] 

8 

where  the  genotype  i  is  now  a  member  of  the  combined  sequence 
space  of  /  epitopes.  The  amino  acid  sequence  (“phenotype”)  of 
epitope  k  in  genotype  i  is  denoted  as  aik,  and  C(a)  represents  a 


CTL  specific  to  phenotype  a.  Eq.  3B  represents  a  reaction  in 
which,  with  a  rate  b,  virions  of  sequence  i  stimulate  the  produc¬ 
tion  of  a  set  of  CTLs  C(a^)  corresponding  to  the  phenotypes  of 
its  epitopes.  Eq.  3C  denotes  the  reaction  where  cells  infected  with 
viruses  with  genotype  i  are  cleared  by  CTLs  with  phenotypes  that 
match  one  of  its  epitopes  with  a  rate  p ,  and  Eq.  3D  represents  the 
natural  decay  of  T  cells  with  a  rate  g.  Under  the  alternative  de¬ 
terministic  continuum  approximation,  one  may  write: 


i 


YjQarini  ~P  E  C 

i  k=  1 

[4A] 

c(a)  =  bna  - gc(a), 

[4B] 

where  c( a)  is  the  number  of  T  cells  C(a)  and  na  is  the  total  num¬ 
ber  of  virions  containing  epitopes  with  phenotype  a. 

The  qualitative  features  of  the  infection-clearance  dynamics 
can  be  illustrated  by  setting  the  mutation  rate  p  to  zero.  Numer¬ 
ical  integrations  of  Eq.  4  yield  the  continuum  approximation  re¬ 
sult  (Fig.  S4  for  p  =  0  and  /  =  1),  which  showed  reasonable 
agreement  with  the  stochastic  simulation  results.  For  nonvanish¬ 
ing  mutation  rates,  numerical  integration  rapidly  becomes  prohi¬ 
bitive  with  increasing  L  because  of  the  high  dimensionality  of  the 
sequence  space.  The  inverse  of  b  is  a  measure  of  the  time  delay  of 
the  growth  of  T  cells  compared  to  that  of  viruses.  Here,  we 
adopted  b  values  of  approximately  0.01  d-1.  The  efficiency  of 
CTLs  in  recognizing  cells  presenting  the  epitope  and  killing  them 
is  represented  by  parameter  p,  the  clearance  rate.  We  found 
p  values  of  10-4  ~  10-3  d-1  to  give  good  fits  to  experiments 
(see  below). 

Immune  Escape  Dynamics.  To  make  comparisons  with  experiments, 
it  is  necessary  to  consider  that  fitness  is  determined  by  amino  acid 
sequences,  and  many  mutations  are  synonymous.  Therefore,  the 
full  immune  escape  dynamics  represented  by  Eq.  3  distinguishes 
the  amino  acid  sequence  aik  corresponding  to  the  nucleotide 
sequence  of  epitope  k  in  genotype  i.  We  refer  to  the  set  of  nu¬ 
cleotide  and  amino  acid  sequences  as  “genotypes”  and  “pheno¬ 
types,”  respectively.  The  fitness  rt  is  a  function  of  phenotypes 
only.  During  simulations,  each  genotype  is  translated  into  a  phe¬ 
notype,  and  a  new  mutant  is  checked  against  the  existing  pheno¬ 
type/genotype  list,  which  is  updated  and  expanded.  The  nature  of 
this  genotype-phenotype  map  plays  important  roles  in  evolution¬ 
ary  dynamics,  leading  to  key  signatures  of  selective  pressure,  such 
as  the  ratio  of  synonymous  to  nonsynonymous  mutations  (28). 
Quantitative  insights  to  the  effects  of  this  mapping  on  molecular 
evolution  have  been  provided  by  Manrubia  and  coworkers  (29), 
who  studied  short  RNA  sequences  for  which  fitness  can  be  esti¬ 
mated  by  secondary  structure  predictions.  In  our  case,  a  simpli¬ 
fied  overview  of  the  genotype-phenotype  map  can  be  gleaned 
from  a  network  representation  of  the  genetic  code  (Fig.  S5):  If 
all  mutations  were  neutral  and  amino  acids  within  an  epitope  in¬ 
dependent,  this  map  would  be  sufficient  to  determine  the  statis¬ 
tics  of  evolutionary  drifts.  The  accessibility  of  certain  mutations 
from  a  phenotype,  in  particular,  has  been  shown  to  affect  the 
evolvability  of  viruses  (30). 

Another  major  ingredient  for  a  realistic  model  of  immune  es¬ 
cape  is  the  fitness  landscape  beyond  the  level  of  the  single-peak 
model.  Much  progress  has  been  made  recently  in  understanding 
the  nature  of  fitness  landscapes  (31-34).  Explicit  fitness  measure¬ 
ments  of  viral  clones  (35,  36)  and  biochemical  assays  of  proteins 
(37)  both  indicate  that  single-nucleotide  substitutions  lead  to  a 
broad  distribution  of  fitness  changes,  most  of  which  are  deleter¬ 
ious.  Therefore,  one  may  assume  that  the  fitness  of  a  mutant  is  a 
random  variable  centered  around  a  mean  fitness  value  /.  We  ex¬ 
pect  this  mean  fitness  to  be  a  decreasing  function  of  distance  d  to 
the  WT  (the  number  of  amino  acids  that  are  different),  which  we 
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Fig.  1.  Simulation  of  full  immune  escape  dynamics  of  a  single  SIV  epitope 
Tat28-35SL8.  (/A)  Viral  load.  ( B )  Frequencies  of  WT  and  the  dominant  escape 
mutants  (EM).  Symbols  represent  experimental  data  from  refs.  9  and  22,  with 
each  symbol  corresponding  to  one  of  four  different  animals.  Solid  lines  are 
from  our  stochastic  quasispecies  model  with  t  =  0.8  d,  p  =  2.0  x  10~7  bp-1, 
b  =  0.01  d-1,  g  =  0.2  d_1,  and  p  =  2.0  x  10-4  d-1.  The  fitness  function  para¬ 
meters  were  a  =  0.1  and  \  =  1  (Methods).  The  EM  frequency  is  for  a  single 
trajectory  while  the  rest  are  averages. 

assume  to  be  exponential:  f(d)  =  exp (-d/Q,  where  £  denotes  a 
characteristic  distance  (Methods). 

An  empirical  evidence  for  this  choice  can  be  found  from  a  re¬ 
cent  experimental  study  by  Fernandez  et  al.  (37),  who  measured 
the  fitness  landscapes  of  HIV-1  protease  quasispecies  for  three 
patients.  We  plotted  the  distribution  of  their  reported  fitness 
values  as  a  function  of  distance  to  the  dominant  phenotype 
and  found  near-exponential  dependence  for  two  quasispecies 
(Fig.  S6).  Our  fitness  function  therefore  models  both  the  de¬ 
crease  of  the  mean  fitness  away  from  WT  and  the  distribution 
of  neutral,  deleterious,  and  beneficial  mutants  for  a  given  dis¬ 
tance.  In  simulations,  these  fitness  values  were  assigned  dynami¬ 
cally  to  newly  encountered  phenotypes.  It  is  important  to  note, 
however,  that  this  landscape  chosen  is  still  an  approximation  that 
ignores  many  potentially  important  effects,  such  as  the  heteroge¬ 
neity  of  the  mutational  neighborhoods  within  the  phenotype 
space. 

We  tested  this  quasispecies  model  with  the  ultradeep  sequen¬ 
cing  data  (9, 22)  of  the  SIV  epitope  Tat28-35SL8  (38).  Fig.  1  shows 
the  time  dependence  of  the  viral  load  and  WT  frequency  from 
simulations  of  the  single-epitope  version  of  Eq.  3  compared  with 
the  experimental  data  (22).  The  initial  rapid  growth  of  the  viral 
load,  its  subsequent  decrease  as  CTLs  are  activated,  and  the  more 
gradual  increase  as  mutants  appear  are  all  captured  quantita¬ 
tively.  The  WT  frequency  decrease  agrees  with  experimental 
trends  (Fig.  IB)  with  signatures  of  two  distinct  time  scales  (22). 
The  mutation  rate  we  used  (p  =  2x  10-7  bp-1)  was  chosen  to 
obtain  the  best  overall  agreement  for  WT  frequency  with  the  sin¬ 
gle-epitope  version  of  the  model.  For  multiepitope  applications, 
larger  mutation  rate  values  close  to  the  experimental  estimate 
for  HIV-1  (p  =  3.4  x  10-5  bp-1)  (39)  gave  realistic  dynamics 
(Fig.  2). 

Fig.  IB  also  shows  the  frequency  of  the  most  dominant  escape 
mutant  (EM)  at  each  time  point.  As  mutations  occur,  a  new  strain 
proliferates  temporarily  before  being  curbed  down,  and  the  total 
number  of  phenotypes  continues  to  increase.  The  increase  in  via¬ 
bility  of  populations  with  multiple  but  lower  fitness  peaks  has 
been  described  first  by  Schuster  and  Swetina  (40)  and  has  later 
been  known  as  the  “survival  of  the  flattest”  effect  (41).  Further 
insights  into  the  role  this  expansion  in  sequence  space  plays  in  the 
growth  of  viral  loads  can  be  gained  from  the  deterministic  con¬ 
tinuum  approximation,  Eq.  4,  without  mutation:  The  immune  re¬ 
sponse  to  WT  leads  to  damped  oscillations  of  viral  load  and  CTL 


Fig.  2.  Typical  disease  progression  patterns  in  the  runaway  regime  from  the 
multiepitope  quasispecies  model.  (A)  Viral  load.  ( B )  Total  CTL  levels.  (C)  Num¬ 
ber  of  distinct  phenotypes  per  epitope  present  in  the  population.  The  inset  in 
A  shows  the  viral  load  in  linear  scale  for  p  =  1.0  x  10-3  d-1  and  p  =  1.0  x 
10-5  bp-1.  Other  parameter  values  were  b  =  0.02  d_1,  g  =  0.1  d_1,  t  =  1  d, 
o  =  0.2,  and  \  =  1.  The  units  of  p  and  p  in  the  legends  are  d-1  and  bp-1, 
respectively.  All  data  represent  averages  over  trajectories. 

level  (Fig.  S4),  reaching  a  steady  state  n*  and  c*.  This  process  is 
repeated  for  each  new  mutant,  leading  to  continued  increases 
in  n*  and  c *  roughly  proportional  to  the  number  of  phenotypes 
(18).  However,  the  timing,  distribution,  and  the  probability  of 
the  appearance  of  mutants  are  highly  nontrivial  functions  of 
the  characteristics  of  the  system,  including  genome  length,  gen¬ 
otype-phenotype  map,  and  fitness  landscape.  Our  quantitative 
quasispecies-based  model,  Eq.  3,  provides  a  realistic  description 
of  this  complex  diversification  process  via  the  single  parameter  p. 
We  note  that,  for  a  single  epitope,  this  interpretation  ignores  the 
effects  of  competition  among  viral  strains  because  each  CTL 
is  specific  to  only  one  phenotype.  In  reality,  in  addition  to  com¬ 
peting  directly  for  host  cells,  different  viral  strains  share  their  vul¬ 
nerability  to  CTLs  specific  to  common  epitopes  within  their 
genomes.  By  considering  multiple  epitopes  within  a  strain,  our 
model  takes  into  account  this  interdependence  of  viral  strains 
and  their  indirect  competition  arising  from  shared  epitope  phe¬ 
notypes  (see  below). 

As  a  quantitative  measure  of  the  relative  degree  of  WT 
persistence  during  immune  escape,  we  also  examined  the  time 
(half-life)  required  for  WT  frequency  to  reach  one-half  (Fig.  IB) 
(22)  with  varying  stimulation  rate  b  and  clearance  rate  p  (Fig.  S7). 
The  SIV  epitopes  studied  by  Bimber  et  al.  (9)  were  estimated  to 
have  a  WT  half-life  of  about  20  d  (22).  We  found  that,  within  our 
model,  the  WT  half-life  increases  with  increasing  p  and  b  in  this 
range:  An  increase  in  the  magnitude  of  selection  pressure,  rather 
than  accelerating  immune  escape,  suppresses  viral  population 
growth  more  effectively  and  reduces  the  chances  of  escape  muta¬ 
tion  (Fig.  2).  The  immune  escape  thus  becomes  more  pronounced 
when  viral  loads  in  the  chronic  phase  are  larger  with  more  fre¬ 
quent  mutations. 

Disease  Progression  Under  Immune  Response.  We  examined  general 
trends  of  viral  infection-immune  response  dynamics  within  our 
quasispecies  model  on  multiple-epitope  levels  (/  =  3)  with  varia¬ 
tions  of  key  model  parameters.  The  evolutionary  dynamics,  in 
particular,  is  critically  affected  by  p ,  the  clearance  rate  represent¬ 
ing  the  effectiveness  of  immune  response,  and  the  mutation 
rate  p.  We  identified  two  qualitatively  different  behaviors  based 
on  the  eventual  fate  of  viral  load/CTL  level:  in  one  regime 
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(“runaway,”  Fig.  2),  the  chronic  phase  after  the  resolution  of  the 
acute  infection  is  eventually  followed  by  a  runaway  growth  in  viral 
loads  (Fig.  24).  The  level  of  CTLs  also  rises  consistently  during 
the  chronic  phase  (Fig.  2 B).  This  disease  progression  is  accom¬ 
panied  by  an  increase  in  the  diversity  of  quasispecies,  or  the 
“spreading  of  clouds”  (Fig.  2 C).  The  time  scale  for  the  duration 
of  the  chronic  phase,  in  particular,  varied  sensitively  with  p  and  p 
(from  a  few  months  to  years)  (Fig.  24).  We  found  that  the  total 
number  of  phenotypes  per  epitope  (Fig.  2 C)  generally  goes  down 
as  the  number  of  epitopes  increases:  A  new  escape  mutant  has 
much  lower  probability  for  survival  with  multiple  epitopes  be¬ 
cause  it  shares  WT  phenotypes  on  other  parts  of  its  genome  with 
existing  strains,  and  therefore  is  susceptible  to  CTLs  that  are  al¬ 
ready  present. 

The  disease  progression  pattern  in  the  runaway  regime  corre¬ 
sponds  to  situations  where  the  immune  system  is  overwhelmed  by 
viral  growth,  either  shortly  after  infection  (there  is  no  discernible 
chronic  phase  for  sufficiently  small  p)  or  after  a  long  chronic 
phase  and  accumulation  of  escape  mutants.  The  latter  case  clo¬ 
sely  matches  the  characteristic  HIV-1  disease  pathogenesis  (7), 
revealing  the  major  role  played  by  the  uncontrolled  diversifica¬ 
tion  of  the  quasispecies  that  overwhelms  the  CTL  response.  This 
feature  was  first  suggested  by  Nowak  and  May  (19),  who  coined 
the  term  “diversity  threshold”  based  on  a  model  assuming  ran¬ 
dom  appearances  of  mutants.  Our  results  show  that  the  basic  qua¬ 
sispecies  dynamics  under  selection  pressure  provide  a  realistic 
description  of  this  phenomenon. 

If  p  is  sufficiently  large  and  p  sufficiently  small,  one  enters  a 
different  regime  (“extinction,”  Fig.  3),  where  the  acute  and 
chronic  phases  lead  to  viral  loads  that  decrease  either  rapidly 
or  asymptotically  with  time.  As  populations  shrink  in  size,  their 
dynamics  become  increasingly  stochastic,  and  extinction  (a  com¬ 
plete  clearance  of  infection)  occurs  when  the  viral  load  reaches 
zero.  This  behavior  is  inherently  stochastic  and  cannot  be  cap¬ 
tured  by  the  deterministic  approximation  in  Eq.  4.  The  time  re¬ 
quired  for  clearing  the  infection  (“extinction  time”)  typically  has 
a  broad  distribution,  increasing  with  decreasing  p  while  becoming 
infinite  above  a  threshold  p  (Fig.  44).  The  rapid  clearance  of  in¬ 
fection  within  approximately  10  d  corresponds  to  the  normal  state 
of  affairs  in  a  healthy  immune  system  against  viruses  such  as  in¬ 
fluenza.  As  shown  in  Fig.  3 A  (p  =  0.01  d_1),  however,  the  extinc¬ 
tion  time  can  also  reach  time  scales  approaching  a  patient’s 
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Fig.  3.  Typical  disease  progression  patterns  in  the  extinction  regime  from 
the  multi-epitope  quasispecies  model.  (/A)  Viral  load.  ( B )  Total  CTL  levels. 
(C)  Number  of  distinct  phenotypes  per  epitope  present  in  the  population. 
The  parameter  values  other  than  p  and  p  =  1.0  x  10-5  bp-1  were  the  same 
as  in  Fig.  2.  The  data  shown  are  individual  trajectories. 
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Fig.  4.  Crossover  between  the  runaway  and  extinction  regimes.  C4)  Varia¬ 
tion  of  extinction  time  with  mutation  rate  p  for  different  clearance  rates 
p  in  the  extinction  regime.  Error  bars  represent  one  standard  deviation.  Near 
the  threshold,  the  extinction  events  occur  increasingly  in  the  edge  of  viral 
load  fluctuations  (Fig.  3A).  ( B )  The  estimated  threshold  between  the  runaway 
and  extinction  regimes  in  the  p-  p  parameter  space  (defined  as  the  maxi¬ 
mum  p  for  which  the  extinction  time  converges  for  a  given  p).  The  arrow 
illustrates  the  dynamic  deterioration  of  immune  response  during  HIV-1  infec¬ 
tion.  Other  parameter  values  were  the  same  as  in  Fig.  2. 


lifetime.  This  feature  may  be  relevant  in  understanding  the  basis 
for  some  chronic  viral  infections  with  long  and  stable  setpoints 
(e.g.,  hepatitis  C),  although  the  high  variability  of  extinction  times 
we  observed  (Fig.  44)  likely  may  not  correspond  to  actual  pos¬ 
sible  clearances  of  such  chronic  infections. 

The  threshold  separating  the  runaway  (Fig.  2)  and  extinction 
(Fig.  3)  regimes  can  be  identified  by  the  maximum  mutation 
rate  for  which  the  extinction  time  remains  finite  (Fig.  4 A).  This 
error  threshold  for  the  immune  control  of  infection  is  analo¬ 
gous  to  Eigen’s  threshold  for  genomic  stability:  The  latter  is 
the  error  rate  above  which  replicators  cannot  maintain  a  stable 
master  sequence.  The  former  is  a  maximum  mutation  rate  the 
given  immune  system  can  suppress.  Fig.  4 B  shows  this  threshold 
as  a  function  of  p  and  p.  A  given  virus  would  have  roughly  the 
same  mutation  rate,  while  p  would  vary  with  individual  patients. 
The  sensitivity  of  disease  progression  with  p  is  consistent  with 
the  observation  that  even  when  the  fatality  rate  of  a  certain 
infection  is  known,  the  course  of  infection  in  a  patient  is  often 
unpredictable. 

The  case  for  HIV-1  is  unique  in  the  sense  that  without  treat¬ 
ment,  most  patients  eventually  progress  to  the  disease  stage.  One 
of  the  special  characteristics  of  HIV-1  is  that  it  targets  CD4  + 
(helper)  Tcells,  which  play  critical  roles  in  eliciting  and  mediating 
CTL  responses.  Within  the  context  of  our  model,  the  destruction 
of  CD4+  T  cells  would  lead  to  a  gradual  decrease  in  both  the  sti¬ 
mulation  rate  b  and  the  clearance  rate  p.  One  therefore  expects  a 
continual  deterioration  of  any  initial  effective  control  of  infec¬ 
tion,  represented  by  p  values  that  decrease  over  the  course  of  dis¬ 
ease  progression.  In  Fig.  4 B,  therefore,  even  though  patients  with 
stronger  immune  systems  may  have  p  values  initially  in  the  extinc¬ 
tion  regime,  the  continued  depletion  of  CD4+  cells  would  lower p 
and  cause  them  to  cross  the  threshold. 

Conclusions 

In  this  paper,  we  demonstrated  that  the  classic  quasispecies  the¬ 
ory  can  form  the  basis  of  a  quantitative  model  of  virus  evolution 
under  immune  selection.  One  of  the  apparent  challenges  in  de¬ 
veloping  such  a  model  is  that  viral  genomes,  while  relatively  small 
(approximately  104  bp),  still  constitute  a  huge  sequence  space. 
Here,  we  showed  that  a  set  of  epitopes  ( La  ~  10  amino  acids) 
can  be  considered  as  minimal  units  of  genomic  segments  on  which 
selection  acts.  In  standard  quasispecies  theory,  competition  be¬ 
tween  different  strains  arises  indirectly  from  the  constraint  of 
fixed  total  population  size.  In  our  model,  the  degradation  and 
removal  of  virions  occur  via  immune  response,  which  can  be  re¬ 
garded  as  a  concrete  specification  of  the  selection  forces.  In  this 
case,  indirect  competition  occurs  because  most  viral  strains  share 
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identical  phenotypes  in  a  large  fraction  of  their  epitopes  and  are 
constrained  by  common  groups  of  CTLs.  However,  the  model 
does  not  account  for  the  effects  of  direct  competitions  for  re¬ 
sources  (limited  number  of  host  cells  available  for  infection). 
We  have  additionally  explored  an  extended  model  that  includes 
host-cell  dynamics  (SI  Text),  whose  analysis  suggests  that  the  main 
qualitative  results  remain  valid  (Fig.  S8). 

Our  finding  that  the  relative  importance  of  immune  escape  is 
mainly  determined  by  the  viral  load  in  chronic  infection  may  have 
implications  to  vaccine  design  (42):  Strategies  aimed  at  prevent¬ 
ing  immune  escape,  such  as  targeting  epitopes  with  higher  fitness 
costs  to  mutations,  may  not  confer  more  benefits  than  those  at¬ 
tempting  to  boost  the  overall  immune  response.  We  also  note, 
however,  that  more  realistic  representations  of  T-cell  stimulation 
and  clearance  will  have  to  take  their  dependence  on  phenotypes 
into  account  (p  and  b  should  depend  on  amino  acid  sequences), 
because  mutations  affect  HLA-epitope-CTL  binding  affinities. 
Our  model  also  ignores  complex  epistatic  effects  coupling  muta¬ 
tions  on  the  same  and  different  epitopes  (43).  The  approach  pre¬ 
sented  here  likely  can  provide  a  foundation  for  a  more  compre¬ 
hensive  modeling  framework  to  tackle  such  global  genome-wide 
effects.  In  addition,  although  we  interpreted  Eqs.  3B-3D  strictly 
as  the  CTL-mediated  immune  response,  we  expect  similar  ap¬ 
proaches  to  be  applicable  for  antibody-based  responses. 

Methods 

Stochastic  Evolutionary  Dynamics.  Simulations  are  carried  out  by  applying  the 
Gillespie  algorithm  (26)  to  Eq.  3.  At  a  given  time,  a  dynamic  list  of  genotypes 
and  phenotypes  ( N  and  M  in  total,  respectively),  starting  with  a  single  WT  at 
the  initial  condition,  is  kept  and  updated,  instead  of  enumerating  all  possible 
sequences.  A  genotype  consists  of  the  nucleotide  sequence  of  a  set  of  epi¬ 
topes  (/  in  total),  each  with  the  same  length  L,  such  that  the  total  length  be¬ 
comes  /  x  L.  We  define  the  following  quantities 

air)  =  X  rini’  a‘S)  =  X + 

7=1  7=1 

UiC)  =  X(r7  +  b  +PCi)nP  a«]  =  aN  +  S  X  C(P)’ 

7=1  M 
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each  corresponding  to  the  replication,  stimulation,  clearance,  and  decay  re¬ 
actions  (Eq.  3),  respectively,  where  Cj  =  Hkc(ajk)  is  the  total  number  of  CTLs 
corresponding  to  any  of  the  epitope  phenotypes  belonging  to  the  genotype 
j.  A  random  number  0  <  r  <  1  is  drawn  to  determine  the  time  interval  A t  = 
(1/a)  ln(1/r)  before  the  next  reaction  event,  where  a  =  aj^.  A  second  ran¬ 
dom  number  r  is  drawn,  and  if  a£)  <  ar  <  a,x)  for  x  =  r,  s,  c,  the  replication, 
stimulation,  or  clearance  reaction  is  chosen  for  the  genotype  /,  respectively.  If 
aid_\  <ar  <  aid ^  instead,  the  decay  reaction  is  carried  out  for  the  CTL  with  phe¬ 
notype  a.  In  this  algorithm,  therefore,  a  given  CTL  contributes  equally  to  the 
clearance  probabilities  of  all  viral  strains  containing  its  phenotype.  This  pre¬ 
scription  generates  unbiased  trajectories  of  two  classes  of  objects,  the  viruses 
with  genotype  /  and  phenotype  ajk  and  CTLs  with  phenotype  a.  For  replica¬ 
tion  reactions,  the  chosen  genotype  /  is  mutated  with  probability  p  per  bp. 
Further  details  on  implementing  the  genotype-phenotype  lists  can  be  found 
in  SI  Text. 

Simulation  Conditions.  The  standard  genetic  code  was  used  to  translate 
mutant  genotypes  into  phenotypes.  The  reduced  fitness  sa  =  ra t  of  a  pheno¬ 
type  a  was  taken  as  follows:  sa  =  1  for  the  WT,  and  for  all  other  phenotypes  sa 
is  a  random  variable  with  the  following  Gaussian  distribution:  P(sa)  = 
exp{-[sa  -  f(da)]2]/2a2}/(2n)1/2G,  where  the  position  of  the  maximum 
f(da )  =  exp (-da/%)  is  an  exponentially  decreasing  function  of  the  distance 
da  >  1  (summed  over  all  epitopes;  in  units  of  amino  acids)  of  the  phenotype 
oc  to  the  WT.  The  parameters  %  and  a  are  the  measures  of  the  fitness  costs  of 
mutations  and  of  randomness  in  fitness  distributions,  respectively.  During  si¬ 
mulations,  a  newly  encountered  phenotype  is  dynamically  assigned  a  fitness 
value  based  on  this  distribution  with  negative  ra  values  replaced  by  zero.  Phe¬ 
notypes  containing  stop  codons  were  assigned  zero  fitness.  In  Fig.  1,  the  SIV 
epitope  Tat28-35SL8  with  the  WT  sequence  of  5-TCC  ACT  CCA  GAATCG  GCC 
AAC  CTG-3'  (STPESANL)  (38)  was  used  ( La  =  8  and  /  =  1).  In  Figs.  2-4,  random 
sequences  were  used  for  the  WT  and  simulations  were  performed  with  La  = 
10  and  /  =  3.  Typically,  quantities  were  averaged  over  102  ~  103  trajectories 
unless  otherwise  specified. 
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